Method and system for determining geographical regions of hosts in a network

ABSTRACT

Methods and systems are provided for determining the geographical regions of hosts in a network. A plurality of sample hosts in the network are preselected. The sample hosts are preselected such that they are located in a plurality of geographical regions that are determinable using existing methods and systems or other means. A plurality of monitoring stations are provided in the network to determine first sets of information associated with each of the sample hosts and second sets of information associated with a host whose geographical region is to be determined. The geographical region of the host is then determined to be the same as the geographical region of those sample hosts whose respective mean of first sets of information has the shortest weighted vector (or Euclidean) distance from the second sets of information.

BACKGROUND OF THE INVENTION

[0001] The present invention relates generally to communication networks, and more particularly, to methods and systems for determining the geographical location of a host in a public network, such as the Internet.

[0002] Presently, there are few tools that purportedly determine geographical locations of hosts, such as servers, routers, or any other processor with an identifiable network address in the Internet. One such tool is NetGeo, which is available through Cooperative Association of Internet Data Analysis (CAIDA) and may be accessed on-line at www.caida.org. These tools typically determine the geographical locations of hosts by parsing their associated whois records. To retrieve a whois record for a host, these tools invoke a whois command, which returns a record including a postal address of the entity to which the host belongs.

[0003] Most existing tools, such as NetGeo, may provide reliable geographical information for some hosts, such as Internet routers, but fail to provide reliable geographical information for all hosts in the Internet. For example, although a whois record includes the postal address of the entity to which a host belongs, this postal address may not be the actual location of the host, especially if the host belongs to a large business, corporation, or an international Internet Service Provider (ISP). For such hosts, the whois record typically includes the registered postal address of the entity managing or controlling a host, which is likely to be the business address of the entity's headquarters and not the actual location of the host. Accordingly, the registered postal addresses in whois records are unlikely to be the actual addresses of such hosts.

[0004] However, if an entity is a small business or a university, then it is likely that the host in question may be located at the registered postal address in the associated whois record for that host. For example, “whois -h rs.internic.net monmouth.com” returns a record that includes an address registered in the “rs.internic.net” database for a second level domain name “monmouth.com.” If the domain name monmouth.com belongs to a small local business, it may be inferred that a host with a name, such as “shell.monmouth.com,” is located in the county of Monmouth in the state of New Jersey.

[0005] Accordingly, although most existing tools may provide reliable geographical information about hosts belonging to small entities or universities, these tools cannot provide reliable geographical information for most hosts, especially if the hosts are managed or controlled by large organizations, corporations, or international Internet Service Providers (ISPs).

DESCRIPTION OF THE INVENTION

[0006] To overcome the above and other disadvantages of the prior art, it is desired to provide methods and systems for determining geographical regions of hosts in a network, such as the Internet. Accordingly, methods and systems are provided to determine the geographical regions of hosts in a network. A region may be any geographical area including, for example, a town, city, province, state, country, and/or continent.

[0007] To determine the geographical region of a host in a network, methods and systems consistent with the present invention preselect a plurality of sample hosts in the network such that the sample hosts are located in a plurality of geographical regions that can be determined using existing tools or other means. A plurality of monitors are then provided in the network to determine first sets of information associated with each of the sample hosts and second sets of information associated with the host whose geographical region is to be determined.

[0008] Each first set of information may include, for example, the round-trip time delay and number of hops (or hosts) on a route to a sample host, as determined by a monitor. Likewise, each second set of information may include, for example, the round-trip time delay and number of hops to the host whose geographical region is to be determined.

[0009] Other information about the sample hosts and the host may also be determined and included in the first and second sets of information, respectively. The information may include geographical region information, such as the longitude and latitude of the last identifiable routers on respective routes to the sample hosts and the host, as identified by the plurality of monitors.

[0010] The geographical region of the host is then determined to be the same as the geographical region of those sample hosts whose respective mean of first sets of information has the shortest weighted vector (or Euclidean) distance from the second sets of information. A weighted vector distance may include, for example, a Mahalanobis distance from a mean of the first sets of information to the second sets of information.

[0011] The description of the invention and the following description for carrying out the best mode of the invention should not restrict the scope of the claimed invention. Both provide examples and explanations to enable others to practice the invention. The accompanying drawings, which form part of the description for carrying out the best mode of the invention, show several embodiments of the invention, and together with the description, explain the principles of the invention.

BRIEF DESCRIPTION OF THE DRAWINGS

[0012] In the Figures:

[0013] FIG. 1 is a block diagram of a network that includes a host locator and a plurality of monitoring stations for determining geographical locations of hosts in the network, in accordance with methods and systems consistent with the present invention;

[0014] FIG. 2 is a block diagram of a host locator, in accordance with methods and systems consistent with the present invention;

[0015] FIG. 3 is a block diagram of a monitoring station, in accordance with methods and systems consistent with the present invention;

[0016] FIG. 4 is a flowchart of the steps performed by one or more monitoring stations for determining information about hosts in a network, in accordance with methods and systems consistent with the present invention; and

[0017] FIG. 5 is a flowchart of the steps performed by a host locator for determining the geographical region of a host in a network based on sample hosts information determined by one or more monitoring stations in the network, in accordance with methods and systems consistent with the present invention.

BEST MODE FOR CARRYING OUT THE INVENTION

[0018] Reference will now be made in detail to the preferred embodiments of the invention, examples of which are illustrated in the accompanying drawings. Wherever possible, the same reference numbers will be used throughout the drawings to refer to the same or like parts.

[0019] In accordance with an embodiment of the invention, a host locator and a plurality of monitoring stations are provided to determine the geographical regions of one or more hosts in a network. The monitoring stations may be placed at different

[0020] Host locator 101 may include a processor, such as a personal computer with a network interface for sending and receiving information via one or more hosts 150 in network 100. Alternatively, host locator 101 may include a personal computer in a stand-alone configuration not connected to network 100.

[0021] The number of monitoring stations 110 in network 100 may be an integer larger than, for example, 2. For simplicity, FIG. 1 shows three monitoring stations, namely monitoring stations 110 ₁, 110 ₂, and 110 ₃. Furthermore, monitoring stations 110 ₁-110 _(N) may be capable of sending and receiving information associated with network 100 via one or more other hosts 150, 160, and 170 located in one or more geographical regions. For example, monitoring station 110 ₁ may request and receive metrics information associated with host 140 via hosts 150 and router 120 ₁, one or more of which may be located in different geographical regions.

[0022] Monitoring stations 110 ₁-110 _(N) may be selectively placed at different points in network 100 such that the monitoring stations may request and receive a broad cross-section of information about hosts 130 and 140. For example, as described below in detail, monitoring stations 110 ₁-110 _(N) may each measure certain metrics associated with hosts 130 and host 140, such as round-trip time delay and number of hosts or hops to the hosts 130 and host 140. To measure the metrics, monitoring stations 110 ₁-110 _(N) may invoke a Transmission Control Protocol over Internet Protocol (TCP/IP) utility, such as the traceroute( ) routine.

[0023] Routers 120 ₁-120 _(R) each may include a device that forwards packets in network 100 based on network layer and routing tables, which may be constructed by routing protocols. For simplicity, FIG. 1 shows routers 120 ₁, 120 ₂, 120 ₃, and 120 ₄, each of which connects host 140 to the rest of network 100.

[0024] Hosts 130 and host 140 may each include any device identifiable by a network address, such as an IP address, and may include one or more processors, such as a personal computer, workstation, local area network (LAN) server, microcomputer, minicomputer, mainframe, router, bridge, gateway, bank of modems owned by an Internet Service Provider (ISP), etc.

[0025] FIG. 2 is a block diagram of a host locator 101, in accordance with methods and systems consistent with the present invention. As shown, host locator 101 may include a processor 200, which may connect via bus 210 to a memory 220, a secondary storage 230, a network interface module 240, and an input/output module 250.

[0026] Memory 220 may include a locator program 260, an operating system 270, and a database 280. Locator program 260 may include software, which processor 200 executes to determine geographical regions of one or more hosts, such as host 140 whose geographical region Z is to be determined.

[0027] Database 280 may include information, such as host names, host addresses, and metrics associated with hosts 130 in regions 1 through M as measured by monitoring stations 110 ₁-110 _(N). A host name may include a web name, such as “www.telcordia.com.” A host address may include, for example, the IP address of the host in network 100. The metrics may include, for example, round-trip time delay and the number of hosts or hops to hosts 130 and host 140, as measured from each monitoring station 110 ₁-110 _(N). The metrics may be downloaded over network 100 from monitoring stations 110 ₁-110 _(N) onto database 280 or may be manually loaded onto a tape or diskette at each monitoring station 110 ₁-110 _(N) and then copied onto database 280 via secondary storage 230.

[0028] Database 280 may also include geographical region information, such as the longitude and latitude of hosts 130 in regions 1 through M, and/or the longitude and latitude of the last identifiable routers on respective routes to hosts 130, as identified by monitoring stations 110 ₁-110 _(N).

[0029] Secondary storage 230 may include a computer readable medium such as a disk drive and a tape drive. From the tape drive, software and data may be loaded onto the disk drive, which may then be copied onto memory 220. Similarly, software and data in memory 220 may be copied onto the disk drive, which may then be loaded onto the tape drive.

[0030] Network interface module 240 may include hardware and software for sending and receiving information from monitoring stations 110 ₁-110 _(N) over network 100.

[0031] Input/output module 250 may include, for example, a keyboard or a keypad and a display unit.

[0032] FIG. 3 is a block diagram of a monitoring station, for example monitoring station 110 ₁, in accordance with methods and systems consistent with the present invention. As shown, monitoring station 110 ₁ may include a processor 300, which may connect via bus 310 to a memory 320, a secondary storage 330, a network interface module 340, and an input/output module 350.

[0033] Memory 320 may include a monitor program 360 and an operating system 370. Monitor program 360 may include software, which processor 300 executes to determine information, such as metrics and other information associated with hosts 130 and 140 in network 100. The metrics may include information, such as the round-trip time delay and number of hosts or hops to hosts 130 and 140, as measured from monitoring stations 110 ₁-110 _(N). The other information may include the IP addresses of routers on respective routes to hosts 130 and 140, as identified by monitoring stations 110 ₁-110 _(N).

[0034] Secondary storage 330 may include a computer readable medium such as a disk drive and a tape drive. From the tape drive, software and data may be loaded onto the disk drive, which may then be copied onto memory 320. Similarly, software and data in memory 320 may be copied onto the disk drive, which may then be loaded onto the tape drive.

[0035] Network interface module 340 may include hardware and software for sending and receiving information from network 100.

[0036] Input/output module 350 may include, for example, a keyboard or a keypad and a display unit.

[0037] FIG. 4 is a flowchart of the steps each monitoring station 110 ₁-110 _(N) performs to determine first sets of information associated with a sample of hosts in different geographical regions, such as hosts 130 in regions 1 through M, in accordance with the methods and systems consistent with the invention. Consider monitor program 360 of monitoring station 110 ₁ shown in FIG. 3. First, monitor program 360 may receive from host locator 101 a request to determine metrics and other information associated with a sample of hosts 130 and a host whose geographical region Z is to be determined by host locator 101. The request may include the host addresses of the sample hosts 130 whose geographical regions 1 through M are either predetermined or can be determined by host locator 101 and the host address of host 140 (step 400). The request may also include an indication that monitor program 360 determine portions or all of the requested information during off-peak network hours.

[0038] A user may preselect the sample of hosts 130 from hosts worldwide or from hosts in a particular region, such as a continent, country, or select states or provinces within a country. The host names and addresses associated with sample hosts 130 may be obtained from, for example, Netsizer, a tool developed and made available on-line by Telcordia Technologies, Inc. at www.netsizer.com. A Netsizer database includes the addresses and names of a large sample of hosts worldwide. The user may purchase from Telcordia Technologies, Inc. a copy of the Netsizer database and load portions or all of the information in the database onto database 280 of host locator 101.

[0039] Monitor program 360 may determine a first set of information and other information associated with each sample host 130 (step 410). For example, monitor program 360 may invoke the traceroute( ) routine using the IP address of a sample host 130 (step 410). The traceroute( ) routine returns a record that includes the round-trip time delay, number of hops (or hosts) on a route to the sample host 130, and IP addresses of all hosts including the last identifiable router on the route to the sample host 130, as identified from monitoring station 110 ₁.

[0040] Based on the output of the traceroute( ) routine, monitor program 360 may include in the first set of information the round-trip time delay and number of hops on the route to sample host 130. Monitor program 360 may also determine other information associated with sample host 130, such as the IP addresses of the last identifiable routers on respective routes to sample hosts 130, as identified from monitoring station 110 ₁. Monitor program 360 may repeatedly invoke the traceroute( ) routine to determine a first set of information and other information for all sample hosts 130 in regions 1 through M.
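For illustration only, the reduction of traceroute output to a first set of information might be sketched as follows. The sketch assumes the text format commonly printed by the Unix `traceroute -n` command (a hop number, an address, and round-trip times in milliseconds, with `* * *` for hops that do not answer); the function name `summarize_traceroute`, the sample output, and the choice of the minimum probe time as the round-trip time delay are assumptions of this sketch and are not taken from the description above.

```python
import re

# One hop line of `traceroute -n` output, e.g. " 3  198.51.100.7  12.1 ms  11.8 ms".
HOP_LINE = re.compile(r"^\s*(\d+)\s+(.*)$")
ADDRESS = re.compile(r"\b(\d{1,3}(?:\.\d{1,3}){3})\b")
RTT = re.compile(r"(\d+(?:\.\d+)?)\s*ms")

# Hypothetical sample output using documentation-only addresses.
SAMPLE_OUTPUT = """\
traceroute to 192.0.2.9 (192.0.2.9), 30 hops max, 60 byte packets
 1  10.0.0.1  0.5 ms  0.4 ms  0.6 ms
 2  * * *
 3  198.51.100.7  12.1 ms  11.8 ms  12.4 ms
 4  192.0.2.9  20.3 ms  19.9 ms  20.1 ms
"""


def summarize_traceroute(output, target_ip):
    """Return (rtt_ms, hop_count, last_router_ip), or None if the target never answered."""
    last_router = None
    for line in output.splitlines():
        match = HOP_LINE.match(line)
        if not match:
            continue  # header line or blank line
        hop_number = int(match.group(1))
        rest = match.group(2)
        address = ADDRESS.search(rest)  # first address reported for this hop
        rtts = [float(value) for value in RTT.findall(rest)]
        if address is None:
            continue  # unidentifiable hop ("* * *")
        if address.group(1) == target_ip and rtts:
            # Assumption: the smallest probe time stands in for the delay.
            return min(rtts), hop_number, last_router
        last_router = address.group(1)  # last identifiable router so far
    return None
```

With the sample above, `summarize_traceroute(SAMPLE_OUTPUT, "192.0.2.9")` yields a delay of 19.9 ms, 4 hops, and 198.51.100.7 as the last identifiable router; hop 2 is skipped because it did not identify itself.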

[0041] With respect to host 140, monitor program 360 may invoke the traceroute( ) routine using the IP address of host 140 to determine a second set of information and other information associated with host 140 based on the host address of host 140 (step 420). Monitor program 360 may include in the second set of information the round-trip time delay and number of hops from monitoring station 110 ₁ to host 140. Additionally, monitor program 360 may identify other information associated with host 140, such as the IP address of the last identifiable router on a route from monitoring station 110 ₁ to host 140. In the embodiment of FIG. 1, the output of the traceroute( ) routine may identify, for example, router 120, as the last identifiable router on a route from monitoring station 110 ₁ to host 140.

[0042] Finally, monitor program 360 may receive a request to download to host locator 101 the first and second sets of information and other information associated with sample hosts 130 and host 140 over network 100 (step 430). Alternatively, a user may manually copy the information determined by monitor program 360 onto a tape or diskette and load that information onto database 280 in host locator 101. As stated above, the other monitoring stations 110 ₂-110 _(N) may also perform the steps 400-430 described above.

[0043] FIG. 5 is a flowchart of the steps host locator 101 performs to determine the geographical region of host 140, in accordance with methods and systems consistent with the present invention. Locator program 260 may send a request to each of the plurality of monitoring stations 110 ₁-110 _(N), requesting metrics and other information associated with sample hosts 130 and host 140 (step 500). The request may also include an indication that each monitoring station 110 ₁-110 _(N) determine portions or all of the requested information during off-peak network hours.

[0044] Locator program 260 may receive the first and second sets of information and other information from monitoring stations 110 ₁-110 _(N) and store that information in database 280 (step 505). Locator program 260 may then determine the geographical regions of sample hosts 130 (step 510). Since some ISPs consistently include the geographical region information of hosts in host names, locator program 260 may parse the host names (e.g., a web name) of sample hosts 130 stored in database 280 to determine the geographical regions of those sample hosts 130 whose names include this information. For example, Worldnet, an Internet Service Provider, uses a host naming format that includes information about the geographical region where a host is located. Consider the host name “los-angeles-12.ca.dial-access.att.net.” Locator program 260 may parse out the terms “los-angeles” and “ca” to determine that the host is located in the city of Los Angeles and the state of California.
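As a minimal sketch of the host name parsing described above, the following assumes two small lookup tables of city and state tokens. The tables, the function name `region_from_host_name`, and the rule for stripping a numeric suffix are hypothetical; actual naming formats differ from one ISP to another and would need their own rules.

```python
# Hypothetical lookup tables; a real deployment would need per-ISP naming rules.
KNOWN_CITIES = {"los-angeles": "Los Angeles", "new-york": "New York"}
KNOWN_STATES = {"ca": "California", "ny": "New York", "nj": "New Jersey"}


def region_from_host_name(host_name):
    """Return (city, state) parsed from a host name, or None if nothing matches."""
    labels = host_name.lower().rstrip(".").split(".")
    city = None
    state = None
    for label in labels:
        # Strip a trailing numeric suffix such as "-12" from "los-angeles-12".
        base = label.rstrip("0123456789").rstrip("-")
        if city is None and base in KNOWN_CITIES:
            city = KNOWN_CITIES[base]
        elif state is None and label in KNOWN_STATES:
            state = KNOWN_STATES[label]
    if city is None and state is None:
        return None
    return city, state
```

For the host name “los-angeles-12.ca.dial-access.att.net” this returns Los Angeles and California. A two-letter label is ambiguous in general (for example, “ca” is also a country-code top level domain), so a fuller implementation would likely take the position of the label into account.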

[0045] Alternatively, if database 280 does not include sample host 130 names, locator program 260 may invoke an nslookup( ) routine using the IP address of each sample host 130 to determine its associated host name. Locator program 260 may then parse the host name to determine the geographical region of the sample host 130.

[0046] In addition, since geographical regions of certain entities, such as educational institutions or universities, are well known, locator program 260 may also parse the host names to identify those names that include the top level domain “.edu.” After identifying such host names, locator program 260 may parse each identified host name to identify the corresponding name of a university. Locator program 260 may then map each university name to a particular geographical region, such as a country, city, state, etc.

[0047] For example, database 280 may include a mapping table (not shown) with each entry including a university name and its corresponding geographical region. For each host name including the top level domain “.edu,” locator program 260 may parse out the name of the university and search the mapping table for a matching entry. If locator program 260 finds a matching entry, it associates the geographical region identified in the matching entry with the host name. Otherwise, locator program 260 may retrieve and parse the next host name in database 280.
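The mapping-table lookup might be sketched as below. The table contents and the function name `region_from_edu_host` are illustrative assumptions, as is the simplification that the university name is the label immediately before “.edu”.

```python
# Hypothetical mapping table: label preceding ".edu" -> geographical region.
UNIVERSITY_REGIONS = {
    "rutgers": "New Jersey, US",
    "ucla": "California, US",
}


def region_from_edu_host(host_name, table=UNIVERSITY_REGIONS):
    """Return the region for a ".edu" host name, or None if there is no matching entry."""
    labels = host_name.lower().rstrip(".").split(".")
    if len(labels) < 2 or labels[-1] != "edu":
        return None  # not in the ".edu" top level domain
    # Assumption: the university name is the second-level label.
    return table.get(labels[-2])
```

A host name with no matching entry returns `None`, which corresponds to moving on to the next host name in the database.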

[0048] After determining the geographical region of each sample host 130 (i.e., regions 1 through M), locator program 260 may store that information in database 280. Locator program 260 may then classify or index the first sets of information and other information associated with sample hosts 130 in database 280 according to their respective geographical regions 1 through M (step 515).

[0049] Locator program 260 may merge the first sets of information received from monitoring stations 110 ₁-110 _(N) for each sample host 130 by geographical region (step 520). For example, the merged first sets of information for sample host 130 i in region j may be represented as follows: $X_{ij} = \begin{bmatrix} t_{i1j} \\ h_{i1j} \\ t_{i2j} \\ h_{i2j} \\ \vdots \\ t_{iNj} \\ h_{iNj} \end{bmatrix}$

[0050] where j is an integer with values from 1 through M, and $t_{i1j}$ through $t_{iNj}$ and $h_{i1j}$ through $h_{iNj}$ represent the round-trip time delay and number of hops to sample host 130 i in geographical region j, as determined by monitoring stations 110 ₁-110 _(N), respectively.

[0051] Similarly, locator program 260 may merge the second sets of information received from monitoring stations 110 ₁-110 _(N) (step 525). The merged second sets of information Y may be represented as follows: $Y = \begin{bmatrix} t_{1} \\ h_{1} \\ t_{2} \\ h_{2} \\ \vdots \\ t_{N} \\ h_{N} \end{bmatrix}$

[0052] where $t_{1}$ through $t_{N}$ and $h_{1}$ through $h_{N}$ represent the round-trip time delay and number of hops to host 140, as determined by monitoring stations 110 ₁-110 _(N), respectively.
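The merging and indexing steps can be sketched as follows, assuming each monitoring station reports one (round-trip time, hop count) pair per host and that the stations are always listed in the same order, so that corresponding vector entries line up. The function names `merge_measurements` and `index_by_region` are hypothetical.

```python
from collections import defaultdict


def merge_measurements(per_station):
    """Interleave (rtt, hops) pairs from stations 1..N into one flat vector."""
    vector = []
    for rtt_ms, hops in per_station:
        vector.extend([float(rtt_ms), float(hops)])
    return vector


def index_by_region(samples):
    """Group merged vectors by region.

    samples: iterable of (region, per_station) tuples, one per sample host.
    Returns {region: [merged vector for each sample host in that region]}.
    """
    by_region = defaultdict(list)
    for region, per_station in samples:
        by_region[region].append(merge_measurements(per_station))
    return dict(by_region)
```

For three stations reporting (20.0, 4), (35.5, 9), and (12.0, 3), the merged vector is [20.0, 4.0, 35.5, 9.0, 12.0, 3.0]. The same `merge_measurements` call applies to the measurements for the host whose region is to be determined.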

[0053] Alternatively, before merging the first sets of information, locator program 260 may also determine geographical region information for each of the sample hosts 130 based on other information downloaded from monitoring stations 110 ₁-110 _(N). In this embodiment, in addition to city, state, country, or continent information, locator program 260 may also determine other geographical region information, such as the latitude and longitude associated with the last identifiable routers on respective routes to sample hosts 130, as identified by monitoring stations 110 ₁-110 _(N). Stated another way, locator program 260 may determine the latitude and longitude of those routers with the fewest hops to sample hosts 130 as identified by monitoring stations 110 ₁-110 _(N). For example, router 120 ₁ (shown in FIG. 1) may be identified by monitoring station 110 ₁ as the last router on a route to each sample host 130 in region 1, while router 120 ₄ may be identified by monitoring station 110 ₂ as the last router on a route to each sample host 130 in region 1.

[0054] Based on the IP addresses of the last identifiable routers received from monitoring stations 110 ₁-110 _(N), locator program 260 may invoke, for example, the NetGeo tool to determine the longitude and latitude of the identified routers. As explained above, the NetGeo tool is available through Cooperative Association of Internet Data Analysis (CAIDA) and may be accessed on-line at www.caida.org. For example, locator program 260 may invoke the NetGeo Application Program Interface (NetGeo API) to determine the longitude and latitude of the last identifiable routers on respective routes to sample hosts 130 in regions 1 through M, as identified by monitoring stations 110 ₁-110 _(N). Further information about the NetGeo tool is available on web page http://netgeo.caida.org/perl/netgeo.cgi.

[0055] NetGeo includes a database and collection of PERL scripts used to map IP addresses, domain names, and autonomous system (AS) numbers to geographical regions. The NetGeo database includes tables for mapping location names (city, state, or country) or United States zip codes to latitude and longitude information. When NetGeo receives a request to determine the latitude and longitude of a domain name, NetGeo searches the database for a record containing the target domain name. If NetGeo finds a record for the target domain name, it returns the requested latitude and longitude information. If NetGeo does not find a matching record, it performs one or more whois lookups using the whois servers of the Internet Network Information Center (InterNIC) and/or Réseaux IP Européens (RIPE) until a whois record for the target domain name is found. Additional information about NetGeo is available on-line at www.caida.org.

[0056] Locator program 260 may add the geographical region information determined above to the first sets of information and merge the resulting first sets of information. The merged first sets of information for sample host 130 i in region j may be represented as follows: $X_{ij} = \begin{bmatrix} t_{i1j} \\ h_{i1j} \\ \mathit{lo}_{i1j} \\ \mathit{la}_{i1j} \\ t_{i2j} \\ h_{i2j} \\ \mathit{lo}_{i2j} \\ \mathit{la}_{i2j} \\ \vdots \\ t_{iNj} \\ h_{iNj} \\ \mathit{lo}_{iNj} \\ \mathit{la}_{iNj} \end{bmatrix}$

[0057] where j is an integer with values from 1 through M, $t_{i1j}$ through $t_{iNj}$ and $h_{i1j}$ through $h_{iNj}$ are as described above, and $\mathit{lo}_{i1j}$ through $\mathit{lo}_{iNj}$ and $\mathit{la}_{i1j}$ through $\mathit{la}_{iNj}$ represent the longitude and latitude of the last identifiable routers on respective routes to sample host 130 i in geographical region j, as determined by monitoring stations 110 ₁-110 _(N).

[0058] Similarly, before merging the second sets of information, locator program 260 may determine geographical region information, such as the longitude and latitude of the last identifiable routers on respective routes to host 140, as identified by monitoring stations 110 ₁-110 _(N). Locator program 260 may add the geographical region information to the second sets of information and merge the resulting second sets of information. The merged second sets of information Y may be represented as follows: $Y = \begin{bmatrix} t_{1} \\ h_{1} \\ \mathit{lo}_{1} \\ \mathit{la}_{1} \\ t_{2} \\ h_{2} \\ \mathit{lo}_{2} \\ \mathit{la}_{2} \\ \vdots \\ t_{N} \\ h_{N} \\ \mathit{lo}_{N} \\ \mathit{la}_{N} \end{bmatrix}$

[0059] where $t_{1}$ through $t_{N}$ and $h_{1}$ through $h_{N}$ are as described above, and $\mathit{lo}_{1}$ through $\mathit{lo}_{N}$ and $\mathit{la}_{1}$ through $\mathit{la}_{N}$ represent the longitude and latitude of the last identifiable routers on routes to host 140, as determined by monitoring stations 110 ₁-110 _(N), respectively.

[0060] Locator program 260 may then determine a mean vector $\hat{\mu}_j$ for the merged first sets of information for sample hosts 130 in each geographical region j as follows (step 530): $\hat{\mu}_j = \frac{1}{n_j} \sum_{i=1}^{n_j} X_{ij},$

[0061] where $n_j$ represents the number of sample hosts 130 in geographical region j.

[0062] Locator program 260 may also determine a covariance matrix $\hat{\Sigma}_j$ for the merged first sets of information for sample hosts 130 in each geographical region j as follows (step 535): $\hat{\Sigma}_j = \frac{1}{n_j} \sum_{i=1}^{n_j} \left( X_{ij} - \hat{\mu}_j \right) \left( X_{ij} - \hat{\mu}_j \right)^T,$

[0063] where the superscript $T$ signifies a transpose operation.
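As a minimal sketch of steps 530 and 535, the mean vector and covariance matrix for one region can be computed in plain Python as shown below. The function names are hypothetical, vectors are represented as lists of floats, and the covariance is normalized by the number of sample hosts in the region, following the description above.

```python
def mean_vector(vectors):
    """Component-wise mean of the merged vectors for one region."""
    n = len(vectors)
    dim = len(vectors[0])
    return [sum(v[k] for v in vectors) / n for k in range(dim)]


def covariance_matrix(vectors):
    """Average of the outer products (X - mean)(X - mean)^T over one region."""
    n = len(vectors)
    mu = mean_vector(vectors)
    dim = len(mu)
    cov = [[0.0] * dim for _ in range(dim)]
    for v in vectors:
        d = [v[k] - mu[k] for k in range(dim)]  # deviation from the mean
        for r in range(dim):
            for c in range(dim):
                cov[r][c] += d[r] * d[c] / n
    return cov
```

For the two vectors [1.0, 2.0] and [3.0, 6.0], the mean is [2.0, 4.0] and the covariance matrix is [[1.0, 2.0], [2.0, 4.0]]. That matrix is singular, which illustrates that a region needs enough varied sample hosts before its covariance matrix can be inverted.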

[0064] Based on the covariance matrices determined for geographical regions 1 through M, locator program 260 may determine a weighted vector (or Euclidean) distance $d_j$ from each mean vector $\hat{\mu}_j$ to the merged second sets of information Y (step 540). For example, locator program 260 may determine the Mahalanobis distance $d_j$ from each mean vector $\hat{\mu}_j$ to the merged second sets of information Y as follows: $d_j = \left( \hat{\mu}_j - Y \right)^T \hat{\Sigma}_j^{-1} \left( \hat{\mu}_j - Y \right).$

[0065] Further information on determining a Mahalanobis distance is disclosed in “Applied Multivariate Analysis,” S. James Press, Holt Rinehart Winston, pp. 373-383, 1972, which is incorporated herein by reference.

[0066] Locator program 260 determines that the geographical region of host 140 is region j whose associated mean $\hat{\mu}_j$ has the shortest weighted vector distance from the merged second sets of information Y (step 545). For example, locator program 260 may determine that the geographical region of host 140 is region j whose associated mean $\hat{\mu}_j$ has the shortest Mahalanobis distance $d_j$ from Y.
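Steps 540 and 545 might be sketched as follows. The sketch assumes that a mean vector and an invertible covariance matrix are already available for each region, and it solves a linear system by Gaussian elimination rather than forming the inverse explicitly, which is an implementation choice of this sketch. All function names and the example values are hypothetical.

```python
def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    a = [list(row) + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        a[col], a[pivot] = a[pivot], a[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n + 1):
                a[row][k] -= factor * a[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):  # back substitution
        tail = sum(a[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (a[row][n] - tail) / a[row][row]
    return x


def mahalanobis(mu, cov, y):
    """Compute (mu - y)^T cov^{-1} (mu - y) for one region."""
    diff = [m - v for m, v in zip(mu, y)]
    weighted = solve(cov, diff)  # equals cov^{-1} (mu - y)
    return sum(d * w for d, w in zip(diff, weighted))


def nearest_region(region_stats, y):
    """region_stats: {region: (mean_vector, covariance_matrix)}."""
    return min(region_stats, key=lambda r: mahalanobis(*region_stats[r], y))


# Hypothetical statistics for two regions, one (rtt, hops) pair per host.
EXAMPLE_STATS = {
    "region 1": ([20.0, 5.0], [[4.0, 0.0], [0.0, 1.0]]),
    "region 2": ([80.0, 12.0], [[9.0, 0.0], [0.0, 4.0]]),
}
```

With these example values and Y = [24.0, 6.0], the distance to region 1 is 16/4 + 1/1 = 5, far smaller than the distance to region 2, so `nearest_region(EXAMPLE_STATS, [24.0, 6.0])` selects region 1.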

[0067] While it has been illustrated and described what are at present considered to be preferred embodiments and methods of the present invention, it will be understood by those skilled in the art that various changes and modifications may be made, and equivalents may be substituted for elements thereof without departing from the true scope of the invention. One skilled in the art will appreciate that all or part of the systems and methods consistent with the present invention may be stored on or read from computer-readable media, such as secondary storage devices, like hard disks, floppy disks, and CD-ROM; a carrier wave received from a network such as the Internet; or other forms of ROM or RAM. This invention should be limited only by the claims and equivalents thereof.

[0068] In addition, many modifications may be made to adapt a particular element, technique or implementation to the teachings of the present invention without departing from the central scope of the invention. Therefore, it is intended that this invention not be limited to the particular embodiments and methods disclosed herein, but that the invention include all embodiments falling within the scope of the appended claims.

What is claimed is:
 1. A method for determining a geographical region of a host in a network, said method comprising the steps of: selecting other hosts in the network such that the selected other hosts are located in a plurality of geographical regions that are determinable; determining, at a plurality of points in the network, first sets of information associated with the selected other hosts, respectively; determining, at the plurality of points, second sets of information associated with the host; and determining the geographical region of the host based on the geographical region of one or more of the selected other hosts whose respective mean of first sets of information has a shortest weighted vector distance from the second sets of information.
 2. The method of claim 1, wherein the step of determining the first sets of information comprises the step of: determining time delays in communicating with the selected other hosts from the plurality of points, respectively.
 3. The method of claim 1, wherein the step of determining the first sets of information comprises the step of: determining numbers of hops in one or more routes in the network from the plurality of points to the selected other hosts, respectively.
 4. The method of claim 1, further comprising the step of: determining geographical information associated with last identifiable routers in respective routes in the network from the plurality of points to the selected other hosts.
 5. The method of claim 4, wherein the step ofdetermining the geographical information comprises the step of:determining longitudes of the last identifiable routers in therespective routes.
 6. The method of claim 4, wherein the step ofdetermining the geographical information comprises the step of:determining latitudes of the last identifiable routers in the respectiveroutes.
 7. The method of claim 1, wherein the step of determining the second set of information comprises the step of: determining time delays in communicating with the host from the plurality of the points, respectively.
 8. The method of claim 1, wherein the step of determining the second set of information comprises the step of: determining a number of hops in each route in the network to the host from the plurality of the points, respectively.
 9. The method of claim 1, further comprising the step of: determining geographical information associated with last identifiable routers in respective routes in the network from the plurality of points to the host.
 10. The method of claim 9, wherein the step of determining the geographical information comprises the step of: determining longitudes of the last identifiable routers in the respective routes.
 11. The method of claim 9, wherein the step of determining the geographical information comprises the step of: determining latitudes of the last identifiable routers in the respective routes.
 12. The method of claim 1, further comprising the step of: receiving, from the plurality of points, the first sets of information associated with the selected other hosts; and merging the first sets of information received for each of the other hosts.
 13. The method of claim 1, further comprising the step of: receiving, from the plurality of points, the second sets of information associated with the host; and merging the second sets of information received for the host.
 14. The method of claim 1, further comprising the steps of: parsing names of the selected other hosts to determine geographical information about the selected other hosts; and including the determined geographical region information in the first sets of information.
 15. The method of claim 1, wherein the step of determining the geographical region of the host comprises the steps of: classifying the selected other hosts according to their respective geographical regions; determining mean vectors of the first sets of information associated with the classified selected other hosts; and determining Mahalanobis distances of the determined mean vectors from the second sets of information.
 16. The method of claim 15, further comprising the steps of: selecting one of the determined mean vectors with shortest Mahalanobis distance from the second sets of information; and determining the geographical region of the host to be same as the geographical region of the classified selected other hosts whose respective determined mean vector is the selected one of the determined means.
 17. A system, comprising: a plurality of first processors that determine first sets of information associated with a plurality of first hosts located in a plurality of geographical regions that are determinable, and determine second sets of information associated with a second host whose geographical region is unknown; and at least a second processor that receives the first and second sets of information, determines means of the first sets of information by geographical region, and determines the geographical region of the second host to be the same as the geographical region of the first hosts whose respective mean of first sets of information has a shortest weighted vector distance from the second sets of information.
 18. The system of claim 17, wherein the plurality of first processors are placed at different points in a network that includes the plurality of first hosts and the second host.
 19. The system of claim 17, wherein the first sets of information include traceroute information associated with the plurality of first hosts, respectively.
 20. An apparatus, comprising: a memory including, program code that receives first sets of information associated with a plurality of first hosts located in a plurality of geographical regions that are determinable, receives second sets of information associated with a second host whose geographical region is unknown, and determines the geographical region of the second host to be the same as the geographical region of the first hosts whose respective mean of first sets of information has a shortest weighted vector distance from the second sets of information; and a processor that executes the program code.
 21. The apparatus of claim 20, wherein the first sets of information includes time delays and number of hops to the plurality of first hosts, as determined from a plurality of points in a network that includes the plurality of first hosts.
 22. The apparatus of claim 20, wherein the second sets of information includes time delays and number of hops to the second host, as determined from a plurality of points in a network that includes the plurality of first hosts and the second host.
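
By way of illustration only, and without limiting the claims, the region determination recited above (classify the sample hosts by region, compute a mean vector per region, and select the region whose mean has the shortest weighted vector distance from the measurements for the host) might be sketched as follows. Everything named in the sketch is an assumption made for the example: the function names, the feature layout (a delay and a hop count from each of two monitoring points), the sample values, and in particular the weighting, which uses the inverse variance of each feature. That weighting is a diagonal simplification of a Mahalanobis distance, not the full covariance form.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical merged measurements for sample hosts whose regions are known.
# Assumed feature layout: [delay_ms from point 1, hops from point 1,
#                          delay_ms from point 2, hops from point 2]
SAMPLES = [
    ("east", [10.0, 5, 80.0, 14]),
    ("east", [14.0, 6, 84.0, 15]),
    ("west", [78.0, 13, 12.0, 6]),
    ("west", [82.0, 15, 16.0, 5]),
]


def region_means(samples):
    """Group the sample vectors by region and average them feature by feature."""
    grouped = defaultdict(list)
    for region, vector in samples:
        grouped[region].append(vector)
    return {
        region: [sum(column) / len(column) for column in zip(*vectors)]
        for region, vectors in grouped.items()
    }


def feature_weights(samples, floor=1e-9):
    """Assumed weighting: inverse variance of each feature over all samples."""
    weights = []
    for column in zip(*(vector for _, vector in samples)):
        mean = sum(column) / len(column)
        variance = sum((x - mean) ** 2 for x in column) / len(column)
        # The floor avoids division by zero for a feature that never varies.
        weights.append(1.0 / max(variance, floor))
    return weights


def weighted_distance(a, b, weights):
    """Weighted Euclidean distance between two equal-length vectors."""
    return sqrt(sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights)))


def classify_host(samples, target_vector):
    """Return the region whose mean vector is closest to the target vector."""
    means = region_means(samples)
    weights = feature_weights(samples)
    return min(
        means,
        key=lambda region: weighted_distance(means[region], target_vector, weights),
    )


if __name__ == "__main__":
    # Measurements for a host whose region is unknown (hypothetical values).
    print(classify_host(SAMPLES, [20.0, 7, 75.0, 13]))  # east
```

In this sketch the weights are computed once over all sample hosts; a fuller treatment could instead estimate a covariance matrix and use its inverse, which would also account for correlation between delay and hop count.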